Multilingual e-mail text processing for speech synthesis

Authors

  • Daniela Oria
  • Akos Vetek
Abstract

An integrated method of text pre-processing and language identification is introduced to deal with the problem of mixed-language e-mail messages in a speech-enabled e-mail reading system. Our method can confidently distinguish between the supported languages and switch between several TTS engines or languages to read the portions of the text in the appropriate language. This is achieved by making use of the combined information from a text pre-processor and a language identifier that relies on both statistical information and linguistic features indicative of a particular language.
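As a rough illustration of the idea in the abstract, the sketch below splits a message into sentences, scores each sentence per language by combining a statistical signal (character-trigram overlap) with a linguistic one (function-word cues), and merges adjacent sentences of the same language into runs that could each be handed to a different TTS voice. It is not the authors' implementation: the two-language setup, the tiny sample texts, the cue lists, the equal weighting of the two scores, and every function name are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, tiny reference texts; a real system would train on large corpora.
SAMPLES = {
    "en": "the meeting is on monday and we will send the report to you this week",
    "de": "das treffen ist am montag und wir werden den bericht diese woche schicken",
}
# Hypothetical function-word cues standing in for "linguistic features".
CUES = {
    "en": {"the", "and", "is", "of", "to", "you"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich"},
}


def trigrams(text):
    """Count character trigrams of the lower-cased, space-padded text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def score(sentence, lang):
    """Sum of a trigram-overlap ratio and a function-word ratio (equal weights assumed)."""
    grams = trigrams(sentence)
    overlap = sum(min(n, PROFILES[lang][g]) for g, n in grams.items())
    statistical = overlap / max(1, sum(grams.values()))
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    linguistic = sum(w in CUES[lang] for w in words) / max(1, len(words))
    return statistical + linguistic


def identify(sentence):
    """Return the best-scoring language code for one sentence."""
    return max(PROFILES, key=lambda lang: score(sentence, lang))


def split_by_language(message):
    """Split a message into (language, text) runs of consecutive same-language sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", message) if s.strip()]
    runs = []
    for sentence in sentences:
        lang = identify(sentence)
        if runs and runs[-1][0] == lang:
            runs[-1] = (lang, runs[-1][1] + " " + sentence)
        else:
            runs.append((lang, sentence))
    return runs


if __name__ == "__main__":
    mail = ("The report is ready and I will send it to you. "
            "Das ist nicht der Bericht und ich bin nicht da.")
    for lang, text in split_by_language(mail):
        print(lang, "->", text)
```

Each run returned by `split_by_language` would then be routed to the engine or voice for its language. Sentence-level granularity is a simplification chosen here; mixed-language sentences would need finer segmentation.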

Similar resources

A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system

Automatic synthesis of speech from arbitrary text requires two basic operations: linguistic analysis of input text and speech waveform generation. The achieved quality of the second stage very much depends on the reliability and richness of information generated in the first stage. In this paper we discuss possibilities and problems of text analysis for multilingual speech synthesis. The langua...

GlobalPhone: A Multilingual Text & Speech Database in 20 Languages

This paper describes the advances in the multilingual text and speech database GlobalPhone, a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in 20 languages. GlobalPhone was designed to be uniform across languages with respect to the amount of data, speech quality, the collection scenario, the transcription and phone set convent...

Introduction to multilingual corpus-based concatenative speech synthesis

This tutorial paper addresses foreign-language support in corpus-based concatenative text-to-speech systems. We give an overview of application domains where strictly monolingual speech synthesis is not sufficient and where multilingual text-to-speech is required or highly desirable. We describe two approaches to multilingual corpus-based speech synthesis: phoneme mapping on the one hand, and t...

Design and Evaluation of a SLDS for E-Mail Access through the Telephone

directed to make e-mail universally and seamlessly accessible to a broad population of potential users through an affordable telephone-based service. Thus, the main objective of E-MATTER was to develop a Spoken Language Dialogue System (SLDS) for an e-mail access service that uses a multilingual spoken language interface (both input and output) and that takes into account the cultural and the l...

Multilingual Spoken Language Corpus Development for Communication Research

Multilingual spoken language corpora are indispensable for research on areas of spoken language communication, such as speech-to-speech translation. The speech and natural language processing essential to multilingual spoken language research requires unified structure and annotation, such as tagging. In this study, we describe an experience with multilingual spoken language corpus development ...

Publication date: 2004